PTDT-3807: Add temporal audio annotation support #2013
base: develop
if all_nested:
    entry["classifications"] = self._serialize_explicit_classifications(all_nested, root_frames)
entries.append(entry)
return entries[0] if len(entries) == 1 else {"options": entries, "frames": frames}
Bug: Single Option Checklist Loses Frame Data
The `_create_answer_entry` method's checklist handling returns a single option directly when only one is present. This loses the parent annotation's frame information and creates an inconsistent output structure compared to checklists with multiple options.
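One way to address this is to always return the wrapped structure, regardless of how many options there are. The sketch below is illustrative only: `build_checklist_entry` is a hypothetical stand-in for the final `return` in `_create_answer_entry`, and it assumes that downstream NDJSON consumers accept a one-element `options` list.

```python
from typing import Any, Dict, List


def build_checklist_entry(
    option_entries: List[Dict[str, Any]],
    frames: List[Dict[str, int]],
) -> Dict[str, Any]:
    """Always wrap checklist options together with the parent frames.

    Hypothetical helper: a single option is not unwrapped, so the parent
    annotation's frames survive and the shape matches the multi-option case.
    """
    return {"options": list(option_entries), "frames": list(frames)}


# One option and two options now share the same top-level keys.
single = build_checklist_entry(
    [{"name": "speech"}], [{"start": 0, "end": 1500}]
)
multi = build_checklist_entry(
    [{"name": "speech"}, {"name": "music"}], [{"start": 0, "end": 1500}]
)
```

With this shape, callers never need to branch on whether the result is a bare option or an `options`/`frames` wrapper.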
    pass
else:
    # Both implicit - merge
    seen_values[value_key]["frames"].extend(opt_frames)
Bug: Frame Merging Fails in Radio Answer Serialization
The frame merging logic for radio answers in `_serialize_explicit_classifications` has issues when combining explicit and implicit frame definitions. It can silently drop implicit frames or replace existing implicit frames with new explicit ones, rather than merging all relevant frame ranges. This leads to data loss.
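A merge that never drops ranges can be sketched as a plain interval union. This is not the PR's code: `merge_frame_ranges` is a hypothetical helper, and it assumes that the desired behaviour is to keep every range (explicit or implicit) and collapse only those that overlap. Whether explicit frames should instead take precedence is a design decision the comment does not settle.

```python
from typing import Dict, List

FrameRange = Dict[str, int]  # {"start": ms, "end": ms}, both inclusive


def merge_frame_ranges(
    existing: List[FrameRange], incoming: List[FrameRange]
) -> List[FrameRange]:
    """Union two lists of ranges, collapsing overlaps, without mutating inputs."""
    ordered = sorted(existing + incoming, key=lambda r: (r["start"], r["end"]))
    merged: List[FrameRange] = []
    for rng in ordered:
        if merged and rng["start"] <= merged[-1]["end"]:
            # Overlaps the previous range: extend it instead of dropping data.
            merged[-1]["end"] = max(merged[-1]["end"], rng["end"])
        else:
            # Copy so the caller's dictionaries are never modified.
            merged.append({"start": rng["start"], "end": rng["end"]})
    return merged


implicit = [{"start": 0, "end": 1000}]
explicit = [{"start": 500, "end": 2000}, {"start": 3000, "end": 4000}]
combined = merge_frame_ranges(implicit, explicit)
```

Sorting first keeps the merge linear after the sort, and copying each range avoids the aliasing problems that `list.extend` on shared dictionaries can cause.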
elif root_frames and root_frames[0] is not None and root_frames[1] is not None:
    return [{"start": root_frames[0], "end": root_frames[1]}]
else:
    return []
Bug: Frame Range Parsing Fails for Single Frames
The `_get_nested_frames` method incorrectly determines frame ranges. It requires both `start_frame` and `end_frame` to be explicit, missing valid single-frame annotations where `end_frame` is `None`. It also misinterprets an empty `parent_frames` list, leading to incorrect frame assignments for nested classifications.
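The two cases raised here can be made explicit in a small resolver. The function below is a hypothetical rewrite, not the PR's `_get_nested_frames`; its name and parameters are assumptions. It also assumes that a missing `end_frame` means a single-frame range, and that an empty `parent_frames` list means "inherit nothing" rather than "fall back to the root", which the comment does not specify.

```python
from typing import Dict, List, Optional, Tuple


def resolve_nested_frames(
    start_frame: Optional[int],
    end_frame: Optional[int],
    parent_frames: Optional[List[Dict[str, int]]],
    root_frames: Optional[Tuple[Optional[int], Optional[int]]],
) -> List[Dict[str, int]]:
    """Pick the frame ranges for a nested classification (illustrative)."""
    # An explicit start wins; a missing end is treated as a single frame.
    if start_frame is not None:
        end = start_frame if end_frame is None else end_frame
        return [{"start": start_frame, "end": end}]
    # `is not None` separates "no parent information" from "empty list".
    if parent_frames is not None:
        return [dict(frame) for frame in parent_frames]
    # Fall back to the root range, again tolerating a missing end.
    if root_frames and root_frames[0] is not None:
        root_end = root_frames[0] if root_frames[1] is None else root_frames[1]
        return [{"start": root_frames[0], "end": root_end}]
    return []


single_frame = resolve_nested_frames(1200, None, None, None)
empty_parent = resolve_nested_frames(None, None, [], (0, 5000))
root_fallback = resolve_nested_frames(None, None, None, (0, 5000))
```

Testing `parent_frames is not None` instead of relying on truthiness is the key change: an empty list is falsy in Python, so a bare `if parent_frames:` cannot tell it apart from `None`.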
Here's the updated PR description that reflects the refactoring work we've completed:
Description
This PR introduces Audio Temporal Annotations - a new feature that enables precise time-based annotations for audio files in the Labelbox SDK. This includes support for temporal classification annotations with millisecond-level timing precision.
Motivation: Audio annotation workflows require precise timing control for applications like:
Context: This feature extends the existing audio annotation infrastructure to support temporal annotations, using a millisecond-based timing system that provides the precision needed for audio applications while maintaining compatibility with the existing NDJSON serialization format.
Type of change
All Submissions
New Feature Submissions
Changes to Core Features
Summary of Changes
New Audio Temporal Annotation Types
- `AudioClassificationAnnotation`: Time-based classifications (radio, checklist, text) for audio segments

Core Infrastructure Updates
- `TemporalFrame`, `AnnotationGroupManager`, `ValueGrouper`, and `HierarchyBuilder` components
- `temporal.py` module with generic components that can be reused for video, audio, and other temporal annotation types

Code Architecture Improvements
- `Generic[TemporalAnnotation]` for compile-time type checking
- `frame_extractor` callable allows different annotation types to use the same processing logic
- `overlaps()` method and improved temporal containment logic
- `create_audio_ndjson_annotations()` convenience function

Testing
- `test_v3_serialization.py` (attached at the bottom) that validates both structure and values

Documentation & Examples
- `audio.ipynb` with temporal annotation examples
- `demo_audio_token_temporal.py` showing per-token temporal annotations

Serialization & Import Support
Key Features
Precise Timing Control
Per-Token Temporal Annotations
Ontology Setup for Temporal Annotations
Label Integration
Technical Architecture
Generic Temporal Components
The refactored architecture provides reusable components for any temporal annotation type:
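As a rough illustration of the pattern described in the summary (a generic class parameterised by annotation type, plus a `frame_extractor` callable and an `overlaps()` check), consider the sketch below. The names `Frame`, `FrameGrouper`, `name_of`, and `AudioNote` are all hypothetical and are not the SDK's `TemporalFrame`, `ValueGrouper`, or related classes; the PR's actual signatures may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")  # any annotation type that carries timing information


@dataclass(frozen=True)
class Frame:
    """Inclusive time range in milliseconds (hypothetical stand-in)."""

    start: int
    end: int

    def overlaps(self, other: "Frame") -> bool:
        return self.start <= other.end and other.start <= self.end

    def contains(self, other: "Frame") -> bool:
        return self.start <= other.start and other.end <= self.end


class FrameGrouper(Generic[T]):
    """Groups annotations by name using a caller-supplied frame extractor."""

    def __init__(
        self,
        frame_extractor: Callable[[T], Frame],
        name_of: Callable[[T], str],
    ) -> None:
        self._frame_extractor = frame_extractor
        self._name_of = name_of

    def group(self, annotations: List[T]) -> Dict[str, List[Frame]]:
        groups: Dict[str, List[Frame]] = {}
        for annotation in annotations:
            groups.setdefault(self._name_of(annotation), []).append(
                self._frame_extractor(annotation)
            )
        return groups


@dataclass
class AudioNote:
    """Toy annotation type used only for this example."""

    name: str
    start_ms: int
    end_ms: int


notes = [
    AudioNote("speaker", 0, 1500),
    AudioNote("speaker", 1200, 3000),
    AudioNote("noise", 5000, 5200),
]
grouper = FrameGrouper[AudioNote](
    frame_extractor=lambda n: Frame(n.start_ms, n.end_ms),
    name_of=lambda n: n.name,
)
grouped = grouper.group(notes)
```

Because the grouper only sees frames through the extractor, a video annotation type with different field names could reuse the same class by passing a different callable.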
This feature enables the Labelbox SDK to support precise temporal audio annotation workflows while providing a robust, reusable architecture for future temporal annotation types. The modular design ensures maintainability and extensibility while preserving full backward compatibility.
Note
Introduces temporal audio classification annotations with millisecond frames, NDJSON serialization, tests, and example updates.
- `AudioClassificationAnnotation` with `start_frame`/`end_frame`; export in `annotation_types.__init__`.
- `ClassificationAnswer` and `ClassificationAnnotation` to optionally carry `start_frame`/`end_frame`.
- `Label.frame_annotations()` to include `AudioClassificationAnnotation` frames.
- `data/serialization/ndjson/temporal.py` (grouping/nesting helpers).
- `create_audio_ndjson_annotations`; integrate in `NDLabel.from_common` and exclude audio from non-video path.
- `tests/data/serialization/ndjson/test_audio.py` covering nested text/radio/checklist and frame ranges.
- `examples/annotation_import/audio.ipynb` with temporal annotations (token-level), ontology INDEX-scope example, and MAL upload flow.
- `examples/README.md` tables (basics, exports, annotation import, integrations, model experiments, prediction upload).

Written by Cursor Bugbot for commit b0d5ee4. This will update automatically on new commits.